Evaluating methods for computer-assisted stemmatology using artificial benchmark data sets

Authors

  • Teemu Roos
  • Tuomas Heikkilä
Abstract

Given a collection of imperfect copies of a textual document, the aim of stemmatology is to reconstruct the history of the text, indicating for each variant the source text from which it was copied. We describe an experiment involving three artificial benchmark data sets to which a number of computer-assisted stemmatology methods were applied. Contrary to earlier similar experiments, we propose and use a numerical criterion to evaluate all the solutions. Moreover, our primary data set is significantly larger than those used before. The results suggest the superiority of two computer-assisted methods amongst those tested: the maximum parsimony method implemented in the PAUP* software package and a related compression-based method we have proposed in earlier work.

'No book is published without some discrepancy in each one of the copies. Scribes take a secret oath to omit, to interpolate, to change.' (Jorge Luis Borges: The Lottery in Babylon, in Labyrinths: Selected Stories & Other Writings,

[...] intentional modifications. They accumulated in copies of copies, copies of copies of copies, etc. Consequently, a text of any importance ended up existing in a group of different versions, a so-called tradition, some of which are all but identical to the original, some perhaps hardly recognizable. Connecting each version to its exemplar, i.e. the version from which it was copied, gives a tree-like structure called the stemma, with the original version as the root. The aim of stemmatology is to recover this structure given a set of surviving variants.
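The tree-like structure described above, in which each version points to its exemplar and the original sits at the root, can be illustrated with a small data-structure sketch. This is only an illustration under stated assumptions: the version labels (`O`, `A` to `E`) and the helper names `find_root`, `lineage`, and `copies_of` are hypothetical, and the code is neither one of the reconstruction methods nor the numerical evaluation criterion mentioned in the abstract.

```python
from typing import Dict, List, Optional

# Hypothetical tradition: each version maps to its exemplar, i.e. the
# version it was copied from. None marks the original (the root).
exemplar_of: Dict[str, Optional[str]] = {
    "O": None,  # the original text
    "A": "O",
    "B": "O",
    "C": "A",
    "D": "A",
    "E": "C",
}


def find_root(stemma: Dict[str, Optional[str]]) -> str:
    """Return the single version that has no exemplar."""
    roots = [version for version, parent in stemma.items() if parent is None]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {len(roots)}")
    return roots[0]


def lineage(version: str, stemma: Dict[str, Optional[str]]) -> List[str]:
    """Follow exemplar links from a version back to the original."""
    chain = [version]
    seen = {version}
    while stemma[chain[-1]] is not None:
        parent = stemma[chain[-1]]
        if parent in seen:
            raise ValueError("cycle detected: not a valid tree")
        chain.append(parent)
        seen.add(parent)
    return chain


def copies_of(version: str, stemma: Dict[str, Optional[str]]) -> List[str]:
    """List the versions copied directly from the given version."""
    return sorted(v for v, parent in stemma.items() if parent == version)


if __name__ == "__main__":
    print(find_root(exemplar_of))
    print(lineage("E", exemplar_of))
    print(copies_of("A", exemplar_of))
```

Storing only the exemplar of each version keeps the tree constraint explicit: every version has at most one source, and the original is the only version without one. The reconstruction task is the inverse problem, since in practice the mapping is unknown and only the surviving texts are available.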




Journal:
  • LLC

Volume 24

Publication date: 2009